Yi Wu

AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models

Jan 31, 2026
Via arXiv

From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents

Jan 30, 2026
Via arXiv

JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG

Jan 29, 2026
Via arXiv

Self-Compression of Chain-of-Thought via Multi-Agent Reinforcement Learning

Jan 29, 2026
Via arXiv

Scaling Test-time Inference for Visual Grounding

Jan 20, 2026
Via arXiv

Anti-Length Shift: Dynamic Outlier Truncation for Training Efficient Reasoning Models

Jan 07, 2026
Via arXiv

Generalist vs Specialist Time Series Foundation Models: Investigating Potential Emergent Behaviors in Assessing Human Health Using PPG Signals

Oct 16, 2025
Via arXiv

SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment

Sep 04, 2025
Via arXiv

ACT: Automated Constraint Targeting for Multi-Objective Recommender Systems

Sep 03, 2025
Via arXiv

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

Aug 13, 2025
Via arXiv